Review for NeurIPS paper: WoodFisher: Efficient Second-Order Approximation for Neural Network Compression

Neural Information Processing Systems

Weaknesses: --- Missing details about lambda. While mentioned on line 138, the dampening parameter lambda does not appear in the experimental section of the main body; I only found the value 1e-5 in the appendix (l. 799). How do you select its value? I expect your final algorithm to be very sensitive to lambda, since \delta_L as defined in Eq. 4 selects directions with the smallest curvature. Another comment about lambda: if you set it to a very large value, it becomes dominant compared to the eigenvalues of F, and your technique basically amounts to magnitude pruning. In that regard, magnitude pruning (MP) is just a special case of your technique, obtained with a large dampening value.
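The large-lambda limit the review describes can be checked numerically. The sketch below assumes an OBS-style pruning statistic rho_i = w_i^2 / (2 [(F + lambda I)^{-1}]_{ii}) (one reading of the Eq. 4 the review refers to, with a toy random Fisher matrix, not the paper's actual setup): for a huge lambda, (F + lambda I)^{-1} is approximately (1/lambda) I, so the statistic reduces to (lambda/2) w_i^2 and ranks weights exactly as magnitude pruning does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weights and a random PSD "empirical Fisher" F (stand-ins, not the paper's data).
w = rng.normal(size=5)
A = rng.normal(size=(5, 5))
F = A @ A.T

def pruning_scores(w, F, lam):
    """OBS-style statistic rho_i = w_i^2 / (2 [(F + lam*I)^{-1}]_{ii})."""
    H_inv = np.linalg.inv(F + lam * np.eye(len(w)))
    return w**2 / (2.0 * np.diag(H_inv))

# With a huge dampening lambda, the inverse is ~ (1/lam) I, so the score
# is ~ (lam/2) * w_i^2: the same ordering as magnitude pruning.
scores_large_lam = pruning_scores(w, F, lam=1e8)
magnitude_rank = np.argsort(np.abs(w))
assert np.array_equal(np.argsort(scores_large_lam), magnitude_rank)
```

The assertion passes because squaring is monotone in |w_i|, so once the curvature term is dominated by lambda the two rankings coincide.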


Review for NeurIPS paper: Deep reconstruction of strange attractors from time series

Neural Information Processing Systems

The paper considers the setting in which the observed time series is governed by a dynamical system. However, when the problem is cast into a machine learning setup for general time series analysis, this distinction is sometimes lost. This may be a point to mention in the broader impacts section: in many applications it is not known whether the time series data of interest is governed by a dynamical system. I also have some concerns about the claim of "essentially one governing hyperparameter" (line 316): a. Could the authors provide more evidence that the learning rate should not be considered a hyperparameter? After all, in lines 189-190 the learning rate is listed as a parameter that is tuned.


Conditional Matrix Flows for Gaussian Graphical Models

Neural Information Processing Systems

Studying conditional independence among many variables with few observations is a challenging task. Gaussian Graphical Models (GGMs) tackle this problem by encouraging sparsity in the precision matrix through l_q regularization with q \leq 1. However, most GGMs rely on the l_1 norm because the objective is highly non-convex for sub-l_1 pseudo-norms. In the frequentist formulation, the l_1 norm relaxation provides the solution path as a function of the shrinkage parameter \lambda. In the Bayesian formulation, sparsity is instead encouraged through a Laplace prior, but posterior inference for different \lambda requires repeated runs of expensive Gibbs samplers. Here we propose a general framework for variational inference with matrix-variate Normalizing Flows in GGMs, which unifies the benefits of the frequentist and Bayesian frameworks. As a key improvement on previous work, we train with one flow a continuum of sparse regression models jointly for all regularization parameters \lambda and all l_q norms, including non-convex sub-l_1 pseudo-norms. Within one model we thus have access to (i) the evolution of the posterior for any \lambda and any l_q (pseudo-)norm, (ii) the marginal log-likelihood for model selection, and (iii) the frequentist solution paths through simulated annealing in the MAP limit.


Reviews: Learning A Structured Optimal Bipartite Graph for Co-Clustering

Neural Information Processing Systems

The authors propose a new method for co-clustering. The idea is to learn a bipartite graph with exactly k connected components. This way, the clusters can be inferred directly and no further post-processing step (such as running k-means) is necessary. After introducing their approach, the authors conduct experiments on a synthetic data set as well as on four benchmark data sets. I think the proposed approach is interesting. However, there are some issues.
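The reviewer's point that clusters can be read off directly can be illustrated with a toy example: once the bipartite graph has exactly k connected components, each component is one co-cluster of rows and columns. A pure-Python BFS sketch (hypothetical affinity matrix B, not the paper's learned graph):

```python
from collections import deque

# Hypothetical bipartite affinity between 3 rows and 3 columns, with zero
# blocks arranged so the graph has exactly k = 2 connected components.
B = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
n, m = len(B), len(B[0])

# Adjacency lists over n + m nodes: rows are 0..n-1, columns are n..n+m-1.
adj = {v: [] for v in range(n + m)}
for i in range(n):
    for j in range(m):
        if B[i][j] != 0:
            adj[i].append(n + j)
            adj[n + j].append(i)

# Label each node with its connected component via BFS.
labels, comp = {}, 0
for start in range(n + m):
    if start in labels:
        continue
    labels[start] = comp
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in labels:
                labels[u] = comp
                queue.append(u)
    comp += 1

row_clusters = [labels[i] for i in range(n)]
col_clusters = [labels[n + j] for j in range(m)]
# comp == 2: rows {0, 1} with columns {0, 1} form one co-cluster,
# row 2 with column 2 the other -- no k-means step needed.
```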


Reviews: Synaptic Strength For Convolutional Neural Network

Neural Information Processing Systems

Content: This submission introduces a new framework for compressing neural networks. The main concept is to define the "synaptic strength" of the connection between an input layer and an output feature to be the product of the norm of the corresponding kernel of the input layer and the norm of the input layer. A large synaptic strength indicates that a certain input feature plays a substantial role in computing the output feature. The synaptic strength is incorporated into the training procedure by means of an additional penalty term that encourages a sparse distribution of synaptic strengths. After training, all connections with synaptic strength smaller than some threshold are fixed to zero and the network is fine-tuned.
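One possible reading of this pipeline, sketched with hypothetical names, a random conv weight tensor, and a per-input-channel scale standing in for the "norm of the input" factor (not the submission's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conv weights: (out_channels, in_channels, kH, kW), plus a
# positive per-input-channel scale acting as the input-norm factor.
W = rng.normal(size=(4, 3, 3, 3))
input_scale = np.abs(rng.normal(size=3))

# Synaptic strength of the (input i -> output o) connection: product of the
# kernel-slice norm and the input-channel scale (one reading of the review's
# description, not the paper's exact definition).
kernel_norms = np.linalg.norm(W.reshape(4, 3, -1), axis=2)  # shape (4, 3)
strength = kernel_norms * input_scale[None, :]

# Sparsity penalty added to the training loss, and post-training pruning:
# connections below a threshold are fixed to zero before fine-tuning.
penalty = 1e-4 * strength.sum()                   # L1-style sparsity term
mask = strength >= np.quantile(strength, 0.5)     # keep the stronger half
W_pruned = W * mask[:, :, None, None]             # zero out weak connections
```

The threshold here is a simple median cut for illustration; in practice it would be chosen to hit a target sparsity before the fine-tuning stage the review mentions.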